ABSTRACT

We present an all-sky sample of ~1.4 million AGNs meeting a two-color infrared photometric
selection criterion for AGNs as applied to sources from the Wide-field Infrared Survey Explorer final
catalog release (AllWISE). We assess the spatial distribution and optical properties of our sample and
find that the results are consistent with expectations for AGNs. These sources have a mean density of
~38 AGNs per square degree on the sky, and their apparent magnitude distribution peaks at g ~ 20,
extending to objects as faint as g ~ 26. We test the AGN selection criterion against a large sample of
optically identified stars and determine the “leakage” rate (that is, the probability that a star detected in
an optical survey will be misidentified as a QSO in our sample) to be < 4.0 × 10^-5. We conclude
that our sample contains almost no optically identified stars (< 0.041%), making this sample highly
promising for future celestial reference frame work, as it significantly increases the number of all-sky,
compact extragalactic objects. We further compare our sample to catalogs of known AGNs/QSOs
and find a completeness of > 84% (that is, the probability of correctly identifying a known
AGN/QSO is at least 84%) for AGNs brighter than a limiting magnitude of R < 19. Our sample
includes approximately 1.1 million previously uncatalogued AGNs.

Subject headings: catalogs – infrared: galaxies – galaxies: active – quasars: general – astrometry
– infrared: stars


1. INTRODUCTION 

The International Celestial Reference Frame (ICRF)
is the realization, at radio wavelengths, of the International
Celestial Reference System (ICRS), the solar-barycentric,
quasi-inertial fundamental reference system
adopted by the International Astronomical Union
(Arias et al. 1995). The second realization, ICRF2, consists
of 3,414 compact radio objects (i.e., QSOs), of
which 295 are “defining sources” (Fey et al. 2015). These
QSOs can be nearly ideal reference frame objects, as
they present no significant parallax or proper motion
and, when properly selected, they have minimal spatial
structure or variability. Using Very Long Baseline Interferometry
(VLBI) techniques, ICRF2 defining source
position errors have an estimated noise floor of 40 μas.
Plans for the third realization of the ICRF (ICRF3), to
be released in the 2018 timeframe, focus on further densifying
the source catalog and improving spatial uniformity
(especially at negative declinations), improving the
astrometric accuracy of the non-defining sources to bring
them close to the accuracy of the defining sources, and
extending the ICRF to higher frequencies to reduce the
effects of source structure on position (Jacobs & ICRF-3
Working Group 2014).

Much of the motivation for this improvement to the
radio reference frame is driven by the European Space
Agency’s (ESA) Gaia mission, launched in 2013. Gaia is
a space-based astrometric, photometric, and radial velocity
all-sky survey at optical wavelengths. Over the next
few years, Gaia will deliver astrometric catalogs that are
expected to be adopted as the next optical instantiation
of the fundamental reference frame. Unlike its predecessor,
Hipparcos, which was limited to observing stars
only, Gaia, with its limiting magnitude of V ~ 20, will
directly observe hundreds of thousands of extragalactic
objects. By tying together the positions of reference objects
(i.e., QSOs) observed in both radio and optical, the
Gaia reference frame will be brought into alignment and
rotationally stabilized with respect to the radio reference
frame.

One critical limitation in aligning optical and radio reference
frames is the problem of discrepancies (or “offsets”)
in position between radio and optical measurements.
Offsets can be due to a variety of underlying
physical differences between the emission mechanisms of
AGNs in the optical and in the radio. First, for AGN-dominated
galaxies, optical emission is thought to originate
from the compact accretion disk surrounding the
supermassive black hole (SMBH), while radio emission
can be either compact or extended, depending on the
orientation of the jet with respect to the observer. Second,
for non-AGN-dominated galaxies, an optical centroid
can be shifted relative to the radio position because
of contamination by the host galaxy. Depending on the
distance to the source, the optical position can be significantly
different from the position of a compact radio
core. Third, variability in either the jet or the accretion
disk can cause apparent changes in the overall position
of the AGN in the optical or the radio over time, making
the AGN an unreliable tie source for use in defining the
reference frame.

Position offsets as a function of wavelength in this
class of sources have been extensively studied, beginning
with da Silva Neto et al. (2002) on the correlation between
offsets and X-band (8.4 GHz) structure in ICRF
sources, and extending to large systematic analyses of radio
sources overlapping with the Sloan Digital Sky Survey
(SDSS; Orosz & Frey 2013). Recent optical observations
of ICRF sources found offsets between optical
and radio positions at the 10 mas level, leading the authors
to conclude that unknown effects, most likely having
to do with underlying astrophysical phenomena, induce
this real offset between radio and optical positions.
Such an offset would introduce a fundamental limit of
about 0.5 mas to the accuracy of the alignment between
Gaia and the ICRF (Zacharias & Zacharias 2014), and
would be exacerbated should the sources be time-varying
in nature. These offsets can potentially be minimized
by selecting reference frame objects that minimize problematic
features, such as photometric variability or
optical/radio structure.

It is thus crucial to identify and characterize as many
AGNs/QSOs as possible in order to maximize the number
of reference frame tie objects. Until recently, the
number of known QSOs was quite limited, of order
a few tens of thousands over the entire sky. The
SDSS-DR12 Quasar Catalog (DR12Q), using data from
the Baryon Oscillation Spectroscopic Survey (BOSS;
Dawson et al. 2013) and photometrically selected QSO
candidates from SDSS, contains 297,301 spectroscopically
confirmed QSOs over approximately 9,200 deg^2
(~22% of the sky; Isabelle Pâris 2015, private communication).
Gaia is expected to observe approximately
half a million QSOs over the course of its mission
(Claeskens et al. 2006; Mignard 2012). A large,
all-sky, coherent catalog of well-defined zero-parallax,
zero-proper motion sources that includes but also extends
beyond the network of ICRF sources would permit
an extensive study of all physical offsets associated
with radio and optical reference frames, object positional
stability, and object variability, using ground-based
resources currently available, such as the United
States Naval Observatory (USNO) Robotic Astrometric
Telescope (URAT; Zacharias 2005; Zacharias et al.
2015) and the Panoramic Survey Telescope and Rapid
Response System (Pan-STARRS; Kaiser et al. 2010;
Morganson et al. 2014).

In order to identify and characterize as many
AGNs/QSOs uniformly selected across the sky as possible,
we apply a two-color AGN selection criterion
to the Wide-field Infrared Survey Explorer (WISE;
Wright et al. 2010) database. We describe our methodology
in the following sections, and discuss the properties
of the resultant sample and the reliability of the selection
criteria we have chosen for detecting AGNs. Our
resultant sample contains ~1.4 million AGNs, of which
~1.1 million are previously uncatalogued, and most are
compact (subtending < 1"-2"). The goal of generating
this sample is to provide a large sample of point-like
extragalactic objects with minimal stellar contamination
for the purpose of supporting future photometric,
astrometric, and variability studies; maintenance and
improvements of the celestial reference frame; and general
astrophysical and cosmological uses. This paper is outlined
as follows: in §2 we review the WISE mission and
the AllWISE source catalog and its properties; in §3 we
detail the AGN/QSO selection criteria we use; finally, in
§4 we discuss the resultant sample, as well as some of
the properties of the sample sources derived using the
Sloan Digital Sky Survey data release 12 (SDSS-DR12;
Alam et al. 2015).

2. ALLWISE CATALOG 

The WISE survey is an all-sky mid-IR survey at 3.4,
4.6, 12, and 22 μm (W1, W2, W3, and W4, respectively),
conducted between January 7 and August 6, 2010, during
the cryogenic mission phase, and first made available
to the public on April 14, 2011. WISE has an
angular resolution of 6.1", 6.4", 6.5", and 12.0" in its
four bands. The AllWISE data release, which we use
for this work, incorporates data from the WISE Full
Cryogenic, 3-Band Cryo, and NEOWISE Post-Cryo survey
(Mainzer et al. 2014) phases, which were coadded to
achieve a depth of coverage ~0.4 magnitudes deeper
than previous data releases.^ AllWISE contains positions,
apparent motions, magnitudes, and PSF-profile fit
information for almost 748 million objects. Astrometric
calibration of sources in the WISE catalog was done
by correlation with bright stars from the 2MASS point
source catalog, and the astrometric accuracy for sources
in the AllWISE release was further improved by taking
into account the proper motions of these reference stars,
taken from the fourth USNO CCD Astrograph Catalog
(UCAC4; Zacharias et al. 2013). A comparison with
ICRF sources shows that AllWISE Catalog sources between
8 < W1 < 12 mag have positional accuracies to
within 50 mas, and half of these sources have positional
accuracies to within 20 mas. For more details on the
WISE mission, see Wright et al. (2010).


^ The increase in depth is primarily due to the additional coverage
in the W1 and W2 bands during the Post-Cryo survey phase,
although photometry in all four bands has been improved due in
part to better background estimation.


3. AGN SELECTION


There are numerous mid-IR color selection criteria in
the literature, such as the Spitzer two-color criteria in
Lacy et al. (2004), Stern et al. (2005), Lacy et al. (2007),
and Donley et al. (2012); the WISE two-color criteria
of Jarrett et al. (2011); the WISE one-color criteria of
Stern et al. (2012) and Assef et al. (2013); and the WISE
two-color criteria of Mateos et al. (2012). All of these
color criteria, while defined using different AGN subsamples,
are in general agreement with each other, and
rely on the fact that AGNs separate cleanly from stars
and star-forming galaxies in mid-IR color space (for a
recent empirical discussion of how objects differentiate
in WISE color space, see Nikutta et al. 2014). This
separation arises because (a) stars have nearly
blackbody SEDs with flux densities dropping at wavelengths
longer than a few microns, and (b) while reprocessed
photons from dust heated by star formation
peak at a few tens of microns, the hard radiation
field from an AGN accretion disk heats dust in the surrounding
torus up to the dust sublimation temperature
(1,000-1,500 K), leading to a relatively flat power-law
spectrum that is easily distinguishable from the aforementioned
non-AGN SEDs. Importantly, because the
mid-IR is insensitive to extinction, mid-IR color selection
can pick out heavily obscured or even Compton-thick
(N_H > 10^24 cm^-2) AGNs (Mateos et al. 2013) that
are optically indistinguishable from star-forming galaxies
(see, for example, Figure 1 in Donley et al. 2012),
especially at higher AGN luminosities (Assef et al. 2013;
Messias et al. 2014; Stern et al. 2014). This eliminates
optical selection effects and can yield much larger and
more statistically complete samples of AGNs.

In choosing a mid-IR color selection criterion, we required
that the criterion be defined directly from WISE
data, using only WISE data; this rules out the criteria of
Lacy et al. (2004), Stern et al. (2005), Lacy et al. (2007),
and Donley et al. (2012), which apply to Spitzer IRAC
data. This avoids uncertainties inherent in transforming
Spitzer magnitudes into WISE magnitudes. This requirement
also rules out the SDSS-WISE color criteria of Wu et al.
(2012), which would limit our sample to the SDSS footprint.
We further required that the criterion be defined
directly from a clean, highly reliable sample of AGNs
and QSOs not defined empirically from WISE data; this
rules out the criteria of Jarrett et al. (2011). Finally, we
required that the criterion be a two-color selection involving
W2-W3, which excludes the Stern et al. (2012) and
Assef et al. (2013) criteria. This is because we are not
restricting our sample to high Galactic latitudes, so contamination
by brown dwarfs may occur if we only use
W1-W2. We do not know a priori that our sources are
extragalactic, and some brown dwarfs can share the first
color axis W1-W2 with AGNs due to methane absorption
at 3.3 μm reducing emission in the W1 band (e.g.,
Noll et al. 2000).^ In developing an all-sky catalog of
mid-infrared zero-proper motion, zero-parallax objects,
it is key to minimize brown dwarf contamination, as
many brown dwarfs are within a few pc and therefore
have very high proper motions, as high as ~8" yr^-1
(Luhman 2014). For picking out Spitzer IRAC-selected
AGNs down to a limiting magnitude of W2 < 15, the
WISE one-color AGN selection criteria of Stern et al.
(2012) and Assef et al. (2013) are both very reliable, with
95% and > 90% reliability, respectively. However,
these one-color cuts also overlap significantly with the
mid-IR color space occupied by brown dwarfs. For example,
using the representative sample of the AllWISE
catalog described in §4.3, out of 1,056 sources identified
as AGNs using the Stern et al. (2012) criterion, 62 (5.9%)
also qualify as brown dwarfs according to the combined
empirical criterion for dwarfs with spectral type ≥T5 and
nearby L and T-type dwarfs of Kirkpatrick et al. (2011).
Similarly, of the 1,892 sources identified as AGNs using
the “R90” criterion of Assef et al. (2013), 59 (3.1%) fall
within the combined Kirkpatrick et al. (2011) criterion.

^ See Kirkpatrick et al. (2011); Cushing et al. (2011);
Mace et al. (2013); Cushing et al. (2014) for some samples of
brown dwarfs discovered with WISE.

With these considerations, we chose the two-color selection
criterion of Mateos et al. (2012), who use the
Bright Ultrahard XMM-Newton Survey (BUXS), one of
the largest flux-limited samples of ‘ultra hard’ (4.5-10 keV)
X-ray selected sources, to define a WISE
AGN selection criterion. This reduces bias against
heavily absorbed AGNs. The BUXS is composed
mostly (56.2%) of Type 1 AGNs (emission line widths
> 1500 km s^-1); the remainder are almost entirely
Type 2 AGNs. It is well established that the mid-IR
luminosities of AGNs correlate strongly with their hard
X-ray luminosities (Lutz et al. 2004; Gandhi et al. 2009;
Mateos et al. 2015; Stern 2015; see also the relation in
Secrest et al. 2015), as Compton up-scattering of UV
photons from the accretion disk into the X-ray regime
is largely proportional to accretion disk luminosity. In
terms of overlap with the mid-IR color space shared with
brown dwarfs, of the 860 sources in our representative
sample of the AllWISE catalog that classify as AGNs
according to the Mateos et al. (2012) criterion, only 7
(0.8%) would qualify as L or T dwarfs according to the
combined Kirkpatrick et al. (2011) criteria.

We take all sources from the AllWISE catalog following
equations (3) and (4) from Mateos et al. (2012), and
we require that all of our sources have S/N ≥ 5 in the
first three bands (w1snr, w2snr, w3snr >= 5), as recommended in
Mateos et al. (2012). As a further constraint, we limit
our results to those with cc_flags = ‘0000’, meaning
that the sources are unaffected by known artifacts such
as diffraction spikes, persistence, halos, or optical ghosts.
We subdivided this query into sources above δ > 0° and
sources below δ < 0° to make our queries tractable. We
concatenated the resultant tables using TOPCAT, version
4.2-3.^ For the remainder of this paper, we refer to AGNs
selected in the manner outlined above as mid-IR AGNs
(MIRAGNs), although we reiterate that there are other
selection criteria for mid-IR AGNs in the literature.
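
The selection described above can be sketched in a few lines of Python. This is only an illustration, not the query actually run for this work: the wedge coefficients below are an assumption (the values commonly quoted for the Mateos et al. (2012) three-band wedge in Vega magnitudes) and should be checked against equations (3) and (4) of that paper, and the helper names `Wedge` and `is_miragn` are hypothetical. The dictionary keys follow AllWISE column names (`w1mpro`, `w1snr`, `cc_flags`, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wedge:
    """AGN wedge in the (W2-W3, W1-W2) plane, Vega magnitudes.

    ASSUMPTION: default coefficients are the values commonly quoted for
    Mateos et al. (2012); verify them against that paper before use.
    """
    slope: float = 0.315            # slope of the upper and lower boundaries
    upper: float = 0.796            # intercept of the upper boundary
    lower: float = -0.222           # intercept of the lower boundary
    left_slope: float = -3.172      # slope of the blue (left) boundary
    left_intercept: float = 7.624   # intercept of the blue (left) boundary


def is_miragn(src: dict, wedge: Wedge = Wedge()) -> bool:
    """Return True if an AllWISE row passes the quality cuts and the wedge."""
    # Quality cuts from the text: S/N >= 5 in W1, W2, W3 and no artifact flags.
    if min(src["w1snr"], src["w2snr"], src["w3snr"]) < 5:
        return False
    if src["cc_flags"] != "0000":
        return False
    x = src["w2mpro"] - src["w3mpro"]  # W2-W3 color
    y = src["w1mpro"] - src["w2mpro"]  # W1-W2 color
    return (
        y < wedge.slope * x + wedge.upper
        and y > wedge.slope * x + wedge.lower
        and y > wedge.left_slope * x + wedge.left_intercept
    )


# Toy rows (made-up photometry, for illustration only).
red_source = {"w1mpro": 15.0, "w2mpro": 14.0, "w3mpro": 11.0,
              "w1snr": 20.0, "w2snr": 20.0, "w3snr": 10.0, "cc_flags": "0000"}
blue_source = dict(red_source, w1mpro=10.0, w2mpro=10.0, w3mpro=9.9)
print(is_miragn(red_source), is_miragn(blue_source))
```

Keeping the coefficients in a small dataclass makes it easy to swap in a different wedge, for example to compare against another published criterion.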


4. RESULTS 
4.1. Sample Properties 

Our sample consists of 1,354,775 MIRAGNs spanning
the full sky. In Figure 1 we show a density plot of MIRAGNs
across the sky, clearly showing that our AGN
selection criterion is effectively selecting objects outside
of the Galaxy. Our sample is also relatively uniform
across the sky. In Figure 2 we show the source density
of MIRAGNs. By randomly sampling 10^? 1 deg^2
areas across the sky, we calculate a mean source density
of ~38 deg^-2, with 10% and 90% thresholds of
15 deg^-2 and 59 deg^-2, respectively. We note that the
reasons for the deviation from a truly uniform distribution
centered at N/4π sr = 33 deg^-2 are the over-density
of sources at the ecliptic poles (due to deeper WISE coverage)
and the under-abundance along the Galactic plane
(due to source confusion), both effects visible in Figure 1.
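
The uniform-sky reference value of 33 deg^-2 quoted above follows directly from the sample size and the area of the full sky. The short check below is a sketch of that arithmetic; the variable names are illustrative.

```python
import math

N_MIRAGN = 1_354_775  # sample size quoted in the text

# Area of the full sky: 4*pi steradians, converted to square degrees.
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2

# Density the sample would have if it were spread uniformly over the sky.
uniform_density = N_MIRAGN / FULL_SKY_DEG2

print(f"full sky: {FULL_SKY_DEG2:.0f} deg^2")
print(f"uniform density: {uniform_density:.1f} per deg^2")
```

The result rounds to 33 deg^-2, below the sampled mean of ~38 deg^-2, consistent with the non-uniformities described in the text.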

Almost none of the MIRAGNs in our sample (< 2%)
have a PSF profile-fit reduced χ² (χ²_red) above the threshold
for extended sources in the highest-resolution
W1 band, indicating that the vast majority of our sources
are unresolved by WISE and therefore subtend angular
scales less than ~6". In Figure 3 we show SDSS
thumbnails of a random selection of 25 sources in our
sample, demonstrating that the majority of MIRAGNs
in our sample are indeed compact at optical wavelengths
as well.


^ http://www.star.bris.ac.uk/~mbt/topcat/

Fig. 1. Density plot, Aitoff projection in Galactic coordinates, of our full sample of AGN/QSO sources (MIRAGNs). The under-density of sources
along the Galactic plane below |b| < 15° is due to the AGN color criterion effectively excluding stars and other Galactic sources, combined
with the source confusion limit of WISE (Wright et al. 2010). The increased number density at the ecliptic poles is due to deeper WISE
coverage.

Fig. 2. (a) Normalized histogram of MIRAGNs per deg^2. (b) Corresponding cumulative histogram. The mean source density is
38 deg^-2, with 10% and 90% thresholds of 15 deg^-2 and 59 deg^-2, respectively.

Fig. 3. Random sample of 25 sources taken from our AllWISE/SDSS-DR12 cross-match, showing the typical angular extent of sources
in our sample. No cuts or photometry requirements were made.

Fig. 4. Distribution of r-band PSF FWHM (arcsec) in our sample
cross-matched with SDSS DR12 PhotoObjAll. Nearly all 193,637 sources
in this subsample are unresolved, as we would expect.


4.2. Optical Properties

In order to characterize the optical properties of our
sample, we cross-matched it to SDSS-DR12, which is the
final data release of SDSS-III (Eisenstein et al. 2011),^
within a radial tolerance of R < 1", obtaining 424,366
matches. To determine the fraction of false positive position
matches (that is, incorrectly correlating an object in
our sample with a different SDSS object due to random
positional agreement), we performed the same match on
a scrambled version of our sample coordinates, determining
that less than 1% of our cross-matches are false
positive position matches between the two catalogs.
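
The scrambled-coordinate test for false positive matches can be illustrated with a toy example. The sketch below is not the procedure used for this work, whose scrambling scheme is not specified here: it assumes a simple fixed declination offset as the "scramble" and uses a brute-force match that is only practical for small lists. All function names are hypothetical.

```python
import math


def sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcsec (haversine formula); inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def count_matches(sample, reference, radius_arcsec=1.0):
    """Count sample positions with at least one reference source in the radius.

    Brute force, O(N*M): fine for a toy example, not for real catalogs.
    """
    return sum(
        any(sep_arcsec(ra, dec, rra, rdec) < radius_arcsec
            for rra, rdec in reference)
        for ra, dec in sample
    )


def scramble(sample, offset_deg=0.1):
    """ASSUMED scrambling scheme: shift every position in declination.

    Toy version only; it does not handle positions near the poles.
    """
    return [(ra, dec + offset_deg) for ra, dec in sample]


def false_match_fraction(sample, reference, radius_arcsec=1.0):
    """Ratio of chance matches (scrambled sample) to real matches."""
    real = count_matches(sample, reference, radius_arcsec)
    chance = count_matches(scramble(sample), reference, radius_arcsec)
    return chance / real if real else float("nan")


# Toy catalogs (made-up coordinates in degrees).
reference = [(10.0, 20.0), (10.5, 20.3), (11.0, -5.0)]
sample = [(10.0, 20.0 + 0.2 / 3600), (10.5, 20.3), (50.0, 0.0)]
print(count_matches(sample, reference), false_match_fraction(sample, reference))
```

For catalogs of realistic size, a spatial index (for example a k-d tree on unit vectors) would replace the brute-force loop; the logic of comparing real and scrambled match counts stays the same.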

With our list of cross-matches between our sample and 
SDSS-DR12, we queried the PhotoObjAll table to ob¬ 
tain optical photometric measurements of our sources, 
requiring the clean photometric quality flag be equal 
to 1, and obtained 193,637 sources. We similarly queried 
the SpecObjAll table to obtain spectroscopic redshift in¬ 
formation for our sources, and obtained 39,981 sources. 
The following describes the statistics of the sample based 
on these queries. 

Source Extent: In Figure 4 we show the optical extents
of sources in our sample, given by the psffwhm_r
parameter. The majority are not extended, with a mean
FWHM of 1.2 ± 0.2 arcsec, comparable with the seeing
limit of SDSS (Stoughton et al. 2002), further underlining
the power of our mid-IR selection strategy at picking
out unresolved AGNs/QSOs.

Distances: In Figure 5 we show the spectroscopic
redshifts of MIRAGNs in our sample, which go out to
about z ~ 3. Note that SDSS spectra come from targeted
observations, so this subsample of MIRAGNs suffers
from a selection bias and should not be considered to
be representative of the physical distribution of redshifts
for the entire sample.

^ SDSS-DR12 covers 14,555 square degrees, or about 35.3% of
the sky.

Fig. 5. Distribution of spectroscopic redshifts of our sample
cross-matched with SDSS DR12 SpecObjAll.

Fig. 6. Distribution of W1 (Vega) and g-band (AB) apparent
magnitudes.

Magnitudes: In Figure 6 we show the distribution
of W1 and g-band magnitudes for our sources. In the
mid-IR, 1% of our sources have W1 [3.4 μm] magnitudes
less than 13; 68% have W1 magnitudes less than 16; and
100% have W1 magnitudes less than 19. Optically, 2%
of our sources have g magnitudes less than 18; 33% have
g magnitudes less than 20; 75% have g magnitudes less
than 22; and 96% have g magnitudes less than 24.

Note that the g-band magnitudes extend well past the
~20 mag Gaia limit. Extrapolating the number of SDSS
DR12-matched sources with g-band magnitudes less than
20 to the entire sky, we predict that this sample contains
~1.8 × 10^5 of the AGNs/QSOs that will be observed by
Gaia.
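
The all-sky extrapolation of g < 20 sources can be approximated from numbers quoted in this section. The exact arithmetic is not spelled out in the text, so the sketch below is one plausible reading: it assumes the 33% fraction applies to the 193,637 sources with clean SDSS photometry and scales by the SDSS-DR12 sky fraction of 35.3%.

```python
N_PHOT = 193_637           # MIRAGNs with clean SDSS-DR12 photometry
FRAC_G_LT_20 = 0.33        # fraction with g < 20 quoted in the text
SDSS_SKY_FRACTION = 0.353  # SDSS-DR12 footprint as a fraction of the sky

# Sources with g < 20 inside the SDSS footprint.
n_in_footprint = N_PHOT * FRAC_G_LT_20

# Scale to the full sky, assuming the footprint is representative.
n_all_sky = n_in_footprint / SDSS_SKY_FRACTION

print(f"{n_all_sky:.2e}")
```

Under these assumptions the estimate comes out at about 1.8 × 10^5, in line with the figure quoted above.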
4.3. Stellar Contamination

In creating a sample of extragalactic sources without
knowledge of their redshifts, it is vital to be able to adequately
avoid contamination by stars, especially for unresolved
sources. While stars and AGNs occupy completely
different loci in mid-IR color space due to their
fundamentally different SEDs (see, for example, Figure
12 in Wright et al. 2010), we nonetheless aim to quantify
any possible contamination of our sample by stars. To
do this, we calculate the number of stars in our sample
as follows:

$$N_{c,s,\mathrm{AGN}} = N \cdot f_c \cdot f_{c,s} \cdot f_{c,s,\mathrm{AGN}} \tag{1}$$

where N is the size of the AllWISE catalog (747,634,026),
f_c is the fraction of the AllWISE catalog that has clean
WISE photometry as per §3 (for the rest of this work,
“clean” WISE photometry means this), f_{c,s} is the fraction
of those clean sources that are stars, and f_{c,s,AGN} is
the fraction of stars with clean WISE photometry that
survive the AGN cut we employ.
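
As a numerical sketch of Equation (1), the product can be evaluated with the values reported in the remainder of this section (13,506 of 389,424 sampled sources with clean photometry, 55.6% of clean sources classified as stars, and 20 of 499,724 stars passing the AGN cut). The variable names are illustrative, and the sample size is rounded to 1.4 × 10^6 as in the text.

```python
N_ALLWISE = 747_634_026   # size of the AllWISE catalog
F_C = 13_506 / 389_424    # fraction with clean WISE photometry (~3.47%)
F_CS = 0.556              # fraction of clean sources that are stars
F_CS_AGN = 20 / 499_724   # upper limit: clean stars passing the AGN cut
N_SAMPLE = 1.4e6          # MIRAGN sample size, rounded as in the text

# Equation (1): expected number of stars leaking into the sample.
n_stars = N_ALLWISE * F_C * F_CS * F_CS_AGN

# Upper limit on the fractional stellar contamination of the sample.
contamination = n_stars / N_SAMPLE

print(f"N_c,s,AGN ~ {n_stars:.0f}")
print(f"contamination < {100 * contamination:.3f}%")
```

The product comes out just under 580 stars, which corresponds to a contamination of about 0.041% of the sample.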

To estimate f_c, we constructed a representative sample
of the AllWISE catalog by querying the AllWISE catalog
with a list of 1 million coordinates randomly sampled
across the celestial sphere, returning all sources that were
within a radius of R < 10". This returned 389,424 unique
sources. Of these, 13,506 have clean WISE photometry,
or 3.47%. Estimating f_{c,s} can be done by utilizing the
fact that stars cleanly separate from extragalactic objects
in mid-IR color space, especially in W2-W3. This is
shown more explicitly in Figure 7, which depicts a clearly
bimodal distribution of stars and extragalactic objects in
W2-W3. We select all objects with clean WISE photometry
and with W2-W3 < 2 as “stars”, although there is
some mixing of stars and extragalactic objects between
W2-W3 ~ 1-2, and elliptical galaxies have star-like mid-IR
colors, so this selection should be viewed as conservative.
From this, we estimate that approximately 55.6%
of the sources in the AllWISE catalog with clean photometry
are stars.

Estimating the value of f_{c,s,AGN} can be done by constructing
an unambiguous sample of stars and then cross-matching
the sample with the AllWISE catalog. We created
such a sample of stars from the all-sky PPMXL
catalog (Roeser et al. 2010), a catalog of ~900 million
sources with optical photometric and astrometric information
from USNO-B1.0 and 2MASS, which is complete
down to approximately V ~ 20. We selected all sources
in the PPMXL catalog with B-band magnitude less than
12, as this excludes all AGNs and most extragalactic objects
in the PPMXL catalog.^ We further required that
the absolute values of the proper motions in RA and Dec
be less than 150 mas yr^-1, as this avoids spurious entries
in the PPMXL catalog. We then cross-matched this
sample of stars to the AllWISE catalog within R < 1",
returning 499,724 sources with clean WISE photometry.
Of these, only 20 (0.0040%) fall within our AGN selection
criterion.^ To estimate the number of sources that
would have fallen within our AGN criterion by chance
mismatch, we cross-matched a scrambled version of our
list of star coordinates to the AllWISE catalog, returning
3,626 “matches” within R < 1". Of these, 7 fall within
our AGN selection criterion, implying that many, if not
most, of the 20 stars falling within our AGN criterion
are actually chance mismatches. We therefore consider
0.0040% to be an upper limit to the percentage of stars
with WISE colors following our criterion.

Multiplying these fractions together, N_{c,s,AGN} ~ 580.
Given our conservative estimates described above, we interpret
this as the upper limit of the total number of
AllWISE-observed stars that leak into our sample. With
1.4 × 10^6 sources in our MIRAGN sample, the expected
contamination of the total sample by stars is therefore
< 0.041%. We conclude that nearly 100% of our sample
is uncontaminated by stars. We term this the “reliability”
of our sample. It is important to note that the types
of stars found in PPMXL, an optical catalog, are by no
means the same types of stars found in AllWISE, and
many infrared-bright stars, such as L and T-type dwarfs,
are under-represented in our star sample. We have chosen
a two-color mid-IR AGN selection criterion partly to
avoid contamination by brown dwarfs, but we emphasize
that our reliability analysis pertains specifically to
optical survey work.

^ As a quality control, we required that the difference in B magnitude
between the first and second epochs of USNO-B1.0 be less
than 0.5 mag.

^ Because of the brighter magnitudes of our star sample, saturation
effects in the corresponding AllWISE data become important.
To avoid stars with spurious AllWISE magnitudes, we removed
any with W1 < 8, W2 < 7, and W3 < 3.8. This removed 23 stars
that would have been classified as AGNs according to our criterion.
This is not a significant effect for our sample of AGNs, however,
affecting less than 0.092% of our sample.

Fig. 7. Distribution of W2-W3 (mag) for a representative sample of
sources from the AllWISE catalog, randomly sampled across the
celestial sphere.

4.4. Completeness

While the primary objective of this study is to obtain
a reliable sample of extragalactic sources using mid-IR
color-selected AGNs, it is nonetheless useful to explore






















the statistical completeness of our sample. To do this, we
used the second release of the Large Quasar Astrometric
Catalog (LQAC-2; Souchay et al. 2012), which contains
187,504 quasars, including radio-selected quasars from
the ICRF2, optically selected quasars from SDSS, and
infrared-selected quasars from 2MASS, and thus represents
a robust sample of quasars over a wide range of
wavelengths.

After cross-matching with AllWISE, we find that
93,403 quasars from LQAC-2 have clean detections. The
majority of non-detections are due to sources in LQAC-2
that are too faint, having R > 19.^ Of the 61,377 sources
in LQAC-2 brighter than this limit, 51,618 (84.10%)
have clean detections with WISE. Of these, 46,928 are
MIRAGNs, or 90.91%. Broken down by wavelength-based
source catalog (see Table 1 in Souchay et al. 2012):

radio: 84.00% (15,441/18,391)

near-IR: 82.47% (15,973/19,368)

optical:^ 90.91% (46,927/51,617)

With a magnitude-limited sample, mid-IR AGN classification
is therefore quite complete, even for AGNs selected
from radio surveys. Finally, we note that of the
3,414 ICRF2 sources, 1,364 have clean WISE detections
and R < 19. Of these, 1,219 (89.40%) are MIRAGNs.

^ 166,033 quasars in LQAC-2 have R-band magnitudes in the
catalog, and 104,656 (63.03%) have R > 19.

^ The difference of 1 between this denominator and the full
sample is due to the R magnitudes deriving from complementary
USNO-B1.0 (Monet et al. 2003) data, so one source in LQAC-2
is a purely radio-determined AGN with a cross-identification in
USNO-B1.0; most sources in LQAC-2 have data from multiple
wavelengths.

Fig. 8. (a) Mid-IR color-color plot (W1-W2 versus W2-W3, in mag) for R < 19 sources in LQAC-2;
(b) for DR12Q with g < 20. The black lines are the Mateos et al.
(2012) demarcation. (See §4.4.)
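
The LQAC-2 completeness percentages quoted in this subsection can be recomputed from the stated counts. The sketch below does only that arithmetic; the dictionary labels are illustrative. The recomputed values agree with the quoted percentages to within a few hundredths of a percentage point.

```python
# (numerator, denominator) pairs taken from the counts quoted in the text.
counts = {
    "clean WISE detections among R < 19 sources": (51_618, 61_377),
    "MIRAGNs among clean detections": (46_928, 51_618),
    "radio": (15_441, 18_391),
    "near-IR": (15_973, 19_368),
    "optical": (46_927, 51_617),
    "ICRF2 with clean detections and R < 19": (1_219, 1_364),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


for label, (num, den) in counts.items():
    print(f"{label}: {percent(num, den):.2f}% ({num:,}/{den:,})")
```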

It is of interest to compare the completeness of MIRAGNs
with the number of quasars expected to be
discovered by Gaia down to a magnitude of V = 20
(~5 × 10^5; Mignard 2012). To do this, we performed
a similar analysis using the DR12Q catalog from SDSS,
which contains 297,301 quasars, 44,831 of which have
clean WISE detections. The majority of non-detections
is again due to a limiting magnitude of about g < 20.
Of those clean detections with g < 20, 38,915 (86.8%)
are MIRAGNs. Using the g-band as a proxy for V,
23,906/27,093 (88.2%) of cleanly detected sources with
g < 20 are MIRAGNs. Extrapolating over the whole sky
(the BOSS survey covers ~9.2 × 10^3 deg^2), 8.3 × 10^4
MIRAGNs in our sample with g < 20 are outside the
SDSS footprint and are therefore expected to be new.
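
The extrapolation to sources outside the SDSS footprint can be approximated as follows. The text does not spell out the arithmetic, so this is one plausible reading: it assumes the 23,906 MIRAGNs with g < 20 are spread uniformly over the ~9.2 × 10^3 deg^2 BOSS area and that the same surface density holds over the rest of the sky.

```python
import math

N_MIRAGN_G20 = 23_906    # cleanly detected DR12Q MIRAGNs with g < 20
BOSS_AREA_DEG2 = 9.2e3   # BOSS footprint quoted in the text

# Area of the full sky in square degrees (4*pi steradians).
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2

# Surface density inside the footprint, applied to the sky outside it.
density = N_MIRAGN_G20 / BOSS_AREA_DEG2
n_outside = density * (FULL_SKY_DEG2 - BOSS_AREA_DEG2)

print(f"{density:.2f} per deg^2 -> {n_outside:.1e} outside the footprint")
```

Under these assumptions the estimate is about 8.3 × 10^4, matching the value quoted above.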

It is worth discussing the reasons why ^ 10% — 20% 
of AGNs in LQAG-2 and DR12Q within our magni¬ 
tude limit are not recovered in our sample of MIRAGNs. 
In Figure [8] we show the mid-IR color distribution of 
(a) the LQAG-2 catalog, and (b) the DRI2Q catalog. 
Most sources that do not meet the criteria are bluer in 
their W1-W2 colors, which is due to two effects. First, 
lower ratios of AGN/host galaxy luminos ities preclude 
i nclus ion by our criteria. For example, iMateos et al.l 
(|2QI2f ) found that, above a hard X-ray luminosity of 
L 2-10 keV > 10^^ erg s“^, 97.1% of type 1 AGNs (emis¬ 
sion line widths > 1500 km s“^) and 76.5% of type 2 
AGNs (emission line widths < 1500 km s“^) fall within 
their criteria. Below L 2-10 keV < 10^^ erg s“^, those 
percentages are 84.4% and 39.1%, respectively, and con¬ 
tamination by the host galaxy becomes more evident. 
At lower redshifts {z < 0.5), a larger fraction of AGNs in 
both DR12Q but especially LQAG-2 have bl^ier W1-W2 
colors that fall outside the iMateos et al ] (I 2 ni 2 ti criteria 
Fig. 9. — Distribution of redshifts for AGNs in the LQAC-2 and
DR12Q catalogs meeting the WISE photometry requirements
outlined in §3 and brighter than R < 19 and g < 20, respectively.

(Figure 9). To show that this is due to lower AGN/host
galaxy luminosities, we calculated the rest-frame 5 GHz
luminosities of AGNs with radio data in LQAC-2 by
assuming power-law radio SEDs, with f_ν proportional to
a power of ν set by the spectral index α,
and calculating α directly from the catalog 2 GHz and
5 GHz spectral energy densities. On average, AGNs in
LQAC-2 have power law indices of α = 0.3 ± 0.6, in
line with the flat-spectrum radio SEDs typically seen in
AGNs. Using α to correct for redshift, we find that for
sources below a redshift of z < 0.5 the mean 5 GHz
luminosity is L_5GHz = 2.5 × 10^^ erg s^-1, while for sources
above a redshift of z > 0.5 the mean 5 GHz luminosity is
L_5GHz = 7.9 × 10^^ erg s^-1. AGNs at higher redshift in
LQAC-2 are therefore much more dominant relative to
their host galaxies, and so tend to manifest as MIRAGNs.
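
The two-point spectral index and the redshift correction described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: the sign convention f_ν ∝ ν^(-α) is assumed, the luminosity is taken to be ν L_ν (suggested by the erg s^-1 units), and the luminosity distance is supplied by the caller from whatever cosmology is adopted. The function names are hypothetical.

```python
import math

def spectral_index(f_lo, f_hi, nu_lo=2.0, nu_hi=5.0):
    """Two-point index alpha, assuming f_nu proportional to nu**(-alpha).

    f_lo and f_hi are flux densities (same units) measured at the
    frequencies nu_lo and nu_hi (same units, here GHz).
    """
    return -math.log(f_hi / f_lo) / math.log(nu_hi / nu_lo)

def rest_frame_nu_l_nu(f_obs, nu_hz, z, alpha, d_l_cm):
    """Rest-frame nu * L_nu in erg/s at frequency nu_hz.

    f_obs  : flux density observed at nu_hz, in erg s^-1 cm^-2 Hz^-1
    d_l_cm : luminosity distance in cm (assumed input; depends on cosmology)
    The factor (1 + z)**(alpha - 1) shifts the observed flux density to the
    rest-frame frequency for a power law with the assumed sign convention.
    """
    k_corr = (1.0 + z) ** (alpha - 1.0)
    return 4.0 * math.pi * d_l_cm ** 2 * nu_hz * f_obs * k_corr

# A source with equal flux density at 2 GHz and 5 GHz has a flat spectrum.
alpha_flat = spectral_index(1.0, 1.0)
# A source whose flux density falls as 1/nu has alpha = 1.
alpha_steep = spectral_index(5.0, 2.0)
```

With the opposite sign convention the index simply changes sign, and the exponent in the correction becomes -(α + 1).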

Second, at very high redshifts (z > 2), the SEDs
of even pure AGNs begin to move out of the mid-IR
color-color demarcation (see Figure 5 in Mateos et al.
2012). This effect can be seen in Figure 9 and affects
the DR12Q catalog especially due to the higher average
redshift (z = 2.1) of AGNs in the catalog.

If we exclude sources in the LQAC-2 with z > 2 and
R > 19, 93.0% of the remaining sources are MIRAGNs.
If we apply the same redshift cutoff to DR12Q and
exclude sources with g > 20, 98.2% of the remaining
sources are MIRAGNs. We note that many (32.7% and
51.0%, respectively) of the high-redshift sources excluded
from our sample would have been included if we had used
the one-color criterion (W1-W2 > 0.8) of Stern et al. (2012).
The AGN criteria we adopted thus minimize leakage of
stars into our sample at the cost of some missed AGNs
(i.e., false negatives, or "type II errors").
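
The two selections are not nested, which a small example makes concrete. The sketch below applies only the one-color cut quoted above to the W1-W2 colors of the 25 example sources printed in Table 1, all of which meet the two-color criteria by construction. The two-color boundaries are not given in this section, so they are not reproduced here; the function name is hypothetical.

```python
ONE_COLOR_CUT = 0.8  # W1-W2 threshold quoted in the text (Stern et al. 2012)

# W1-W2 colors of the 25 example MIRAGNs printed in Table 1.
table1_w1_w2 = [
    1.26, 0.73, 1.31, 1.01, 0.71, 0.92, 1.19, 1.11, 1.27, 1.14,
    0.99, 1.31, 0.88, 1.38, 1.65, 0.90, 1.02, 1.05, 1.58, 1.20,
    1.13, 1.46, 1.62, 1.40, 1.41,
]

def passes_one_color(w1_w2, cut=ONE_COLOR_CUT):
    """True if the source is redder than the one-color threshold."""
    return w1_w2 > cut

n_pass = sum(passes_one_color(c) for c in table1_w1_w2)
n_fail = len(table1_w1_w2) - n_pass
print(f"{n_pass} of {len(table1_w1_w2)} pass the one-color cut; {n_fail} do not")
```

In this small printed subset, two sources (W1-W2 = 0.73 and 0.71) are selected by the two-color criteria but would be missed by the one-color cut, while the text notes that the reverse happens for many high-redshift quasars.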

Although more luminous AGNs may also be preferentially
found in more luminous host galaxies. See, for example,
Hamilton et al. (2008).

Finally, DiPompeo et al. (2015) recently published
a catalog of over 5.5 million quasar candidates with
SDSS+WISE photometry. Extending the XDQSOz
model of Bovy et al. (2012) to include WISE photometry,
and using a training set of spectroscopically confirmed
quasars, they assigned quasar probabilities P_QSO
for unresolved sources from SDSS DR8, as well as
computing photometric redshifts, which they find to be
significantly improved by the addition of WISE photometry.
We cross-matched our sample to their catalog to
within R < 1", as before, finding 227,011 matches. We
found several reasons for non-matches, which we outline
below.

The first and most significant reason for non-matches
with our sample is the limiting magnitude. Of the
~ 3.7 million sources in their catalog with high quality
SDSS photometry (PSF g S/N > 5), only ~ 9% have
g < 20. The second reason for non-matches is the
differing photometry requirements between our sample
and the XDQSOz catalog. Of the 197,635 sources in the
XDQSOz catalog with g < 20 not in our sample, 91,711
are not in the AllWISE catalog at all, likely due to the
independent "forced photometry" performed on WISE
data for the XDQSOz. Of the sources not in our sample
that are in the AllWISE catalog, only 5,993 fulfill our
photometry quality requirements as outlined in §3. For
the sources in XDQSOz that are in the AllWISE catalog
with clean WISE photometry brighter than g < 20,
130,420/136,413 (95.6%) are in our sample. Finally, a
third reason for non-matches is again the redshift
limitation of our chosen AGN selection criterion of z ≲ 2. If we
retain only sources with z < 2, g < 20, and with clean
WISE photometry, 110,756/112,138 (98.8%) of sources
in XDQSOz are in our sample. We note that relaxing our
match radius to R < 3" does not significantly alter our
results, yielding a total of 236,963 matches between our
sample and the XDQSOz catalog, with 98.4% of sources
in XDQSOz with z < 2, g < 20, and with clean WISE
photometry in our sample.
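
A positional cross-match of the kind used above can be sketched as follows. This is a brute-force illustration suitable only for small lists, not the matching code used for catalogs of this size (which would need a spatial index). The function names and the two-entry comparison catalog are hypothetical; the two sample positions are the first two rows of Table 1.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

def cross_match(sample, catalog, radius_arcsec=1.0):
    """Pair each sample source with its nearest catalog entry within the radius.

    Returns a list of (sample_index, catalog_index) pairs. Cost is O(N * M).
    """
    matches = []
    for i, (ra, dec) in enumerate(sample):
        best = min(range(len(catalog)),
                   key=lambda j: angular_sep_arcsec(ra, dec, *catalog[j]),
                   default=None)
        if best is not None and angular_sep_arcsec(ra, dec, *catalog[best]) < radius_arcsec:
            matches.append((i, best))
    return matches

# First two Table 1 positions, and a made-up catalog offset by 0.5" and 5".
sample = [(14.947587, -1.169694), (14.915884, -1.246029)]
catalog = [(14.947587, -1.169694 + 0.5 / 3600), (14.915884, -1.246029 + 5.0 / 3600)]

print(cross_match(sample, catalog, radius_arcsec=1.0))
print(cross_match(sample, catalog, radius_arcsec=10.0))
```

Widening the radius picks up the second pair, which is the trade-off between completeness and random matches that the choice of match radius controls.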

4.5. New AGNs 

To estimate how many sources in our sample are
expected to be previously uncatalogued AGNs, we
cross-matched our sample to the Million Quasars
(MILLIQUAS) Catalog, version 4.0, a heterogeneous
compilation of all known or candidate AGNs/QSOs through
May 2015, which contains 1,153,110 entries. It
includes quasar data from the NASA/IPAC Extragalactic
Database (NED), SIMBAD, and SDSS-DR12Q.
Cross-matching our source list to the MILLIQUAS catalog to
within R < 1" yields 202,203 sources; however, in order
to more completely assess the number of sources

These are Pogson magnitudes we calculated as explained in
https://www.sdss3.org/dr8/algorithms/magnitudes.php

Within a 1" radius; relaxing the radius to 3" yields an
additional 14,476 candidate sources in the AllWISE catalog.

http://quasars.org/milliquas.htm

MILLIQUAS includes the Half Million Quasars (HMQ)
Catalogue (Flesch 2015), but the latter excludes candidate
AGNs/QSOs. Of AGNs in the HMQ, 55.33% of sources with
R < 19 are MIRAGNs, suggesting a prevalence of non-dominant,
low-redshift sources or sources at high redshift. Indeed, by further
excluding sources in the HMQ with z < 0.5 or z > 2.0 (see §4.4),
79.19% of sources in the HMQ are MIRAGNs.
TABLE 1
MIRAGN Data

AllWISE               RA          Dec         W1-W2   W2-W3   W1      g       z       z-type
J005947.42-011010.8   14.947587   -1.169694   1.26    2.74    14.73   18.82   1.123   s
J005939.81-011445.7   14.915884   -1.246029   0.73    2.99    14.81   21.35   0.457   s
J010016.47-020912.8   15.068651   -2.153577   1.31    3.57    15.15   ...     2.009   s
J005858.94-010459.4   14.745614   -1.083188   1.01    2.92    15.12   ...     ...     ...
J005847.49-010549.7   14.697876   -1.097150   0.71    2.76    12.26   ...     0.047   s
J010210.49-015630.0   15.543743   -1.941685   0.92    2.38    14.74   ...     0.6     p
J010323.03-012637.0   15.846000   -1.443626   1.19    3.19    14.79   ...     ...     ...
J010032.21-020046.0   15.134236   -2.012790   1.11    3.35    12.86   ...     0.227   s
J005847.77-020808.6   14.699062   -2.135742   1.27    2.89    15.48   ...     ...     ...
J005936.26-013003.8   14.901093   -1.501077   1.14    3.51    15.17   19.46   1.016   s
J005814.49-011507.0   14.560381   -1.251954   0.99    2.61    14.39   ...     ...     ...
J010220.08-005743.5   15.583695   -0.962098   1.31    2.95    15.07   ...     ...     ...
J005804.98-015015.7   14.520752   -1.837710   0.88    3.37    13.84   ...     0.239   s
J010246.00-012600.5   15.691669   -1.433499   1.38    3.50    15.31   ...     ...     ...
J010248.37-021532.9   15.701556   -2.259148   1.65    3.77    15.38   ...     ...     ...
J010304.14-010040.1   15.767260   -1.011162   0.90    2.89    15.24   ...     ...     ...
J010114.58-021141.1   15.310782   -2.194774   1.02    3.70    16.04   ...     ...     ...
J010010.93-014909.5   15.045565   -1.819306   1.05    3.43    15.66   ...     ...     ...
J010003.46-015427.7   15.014454   -1.907698   1.58    4.19    17.25   ...     ...     ...
J010249.02-010545.0   15.704260   -1.095849   1.20    2.68    15.01   19.89   1.038   s
J005956.19-010722.6   14.984129   -1.122969   1.13    3.74    15.94   ...     ...     ...
J005957.93-005311.4   14.991392   -0.886508   1.46    3.37    15.85   ...     ...     ...
J010129.88-010515.9   15.374504   -1.087752   1.62    3.22    15.15   ...     ...     ...
J010104.16-005918.6   15.267358   -0.988513   1.40    3.71    15.84   ...     ...     ...
J005939.26-014846.1   14.913610   -1.812812   1.41    2.99    14.98   ...     ...     ...

Note. — "AllWISE" is the "designation" column in the AllWISE catalog; RA and Dec are the
AllWISE catalog coordinates, in J2000. g-band magnitudes come from LQAC-2 where available,
otherwise from DR12Q. Redshifts come preferentially from LQAC-2, then the non-flagged pipeline
redshifts from DR12Q, then MILLIQUAS. Photometric redshifts from MILLIQUAS are flagged in z-type
as 'p'. The remaining columns are the unique string identifiers found in the matched catalogs, but have
been omitted from this printing for space. The full version of this table will be made available online.


in our sample already in the MILLIQUAS catalog, we
relaxed our cross-match radius to R < 10", obtaining
210,534 matches. To estimate the level of contamination
at this cross-match radius by random matches, we
cross-matched a scrambled version of our sample
coordinates with MILLIQUAS, obtaining 976 matches within
R < 10", a contamination level of 0.46%. Expanding
our match radius to R < 30" produces only an additional
6,216 matches, with an unacceptable 4.0% level of
contamination by random matches. Subtracting the matched
sources from the 1,354,775 sources in our sample, our
sample of mid-IR selected AGNs is therefore expected to
contain approximately 1.1 million uncatalogued AGNs. We
give an example of our sample, cross-matched to the
LQAC-2, DR12Q, and MILLIQUAS catalogs to within R < 1",
in Table 1.
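
The contamination estimate and the count of new sources follow from simple ratios, sketched below with the numbers quoted in the text. The scrambling helper is an assumption: the text does not say how the coordinates were scrambled, and shuffling right ascensions among sources while keeping declinations fixed is only one common choice. The names are illustrative.

```python
import random

# Numbers quoted in the text.
N_SAMPLE = 1_354_775      # MIRAGNs in the full sample
N_MATCH_10 = 210_534      # matches to MILLIQUAS within R < 10"
N_RANDOM_10 = 976         # matches of the scrambled sample within R < 10"

contamination_10 = N_RANDOM_10 / N_MATCH_10
n_uncatalogued = N_SAMPLE - N_MATCH_10

def scramble_ra(coords, seed=0):
    """Shuffle RA values among sources, keeping each Dec in place.

    Assumed scrambling scheme; it preserves the RA and Dec distributions
    while breaking the true source positions.
    """
    ras = [ra for ra, _ in coords]
    random.Random(seed).shuffle(ras)
    return [(ra, dec) for ra, (_, dec) in zip(ras, coords)]

print(f"contamination at 10 arcsec: {contamination_10:.2%}")
print(f"uncatalogued sources: {n_uncatalogued:,}")
```

The scrambled list would then be run through the same cross-match as the real sample, and the number of matches it produces estimates the chance-coincidence rate.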

5. CONCLUSIONS 

We have explored the use of WISE-selected AGNs to
derive a highly reliable sample of extragalactic sources
for astrometric purposes. Our primary conclusions are
as follows:

1. Using the two-color AGN criteria of Mateos et al.
(2012) and strict photometric quality requirements,
we derive a sample of 1,354,775 mid-IR AGNs
from the AllWISE source catalog. Approximately
1.1 million of these were previously uncatalogued.

2. From a reliability analysis using 499,724 stars from
the PPMXL catalog, we estimate that the fraction
of stars in our sample is extremely small, < 0.041%,
and conclude that this technique is extremely
reliable.

3. The use of mid-IR color selection results in a high
level of completeness, and we estimate that our
sample contains ~ 8.3 × 10^4 AGNs/QSOs with
g < 20 that fall outside the SDSS footprint, a
significant fraction (~ 17%) of the number of QSOs
expected to be discovered by Gaia.

In a subsequent paper, we will use the sample
derived here to look for optical signatures of previously
undetected AGN using multi-epoch URAT observations.
URAT is an all-sky astrometric survey in the visible,
and is a follow-up project to the previous UCAC
program. Utilizing the red lens from the UCAC program,
the telescope has been completely redesigned. The new
4-shooter camera consists of four large 10,560 by 10,560
pixel CCDs, with a combined single exposure covering 28
square degrees of the sky at a resolution of 0.9" pix^-1.
The newly released URAT1 catalog (Zacharias et al.
2015; Zacharias 2015) contains accurate positions
(typically 10 to 30 mas std. error) of 220 million stars in
the 3 to 18.5 magnitude range, mainly in the northern
hemisphere. Proper motions have been obtained for
85% of these stars utilizing the Two Micron All Sky
Survey (2MASS) as first epoch. URAT1 is also
supplemented by 2MASS and AAVSO Photometric All-Sky
Survey (APASS) photometry. We will characterize the
astrometric and photometric variability of the AGNs we
detect with URAT, we will provide an optical catalog of
these objects and their derived properties, and we will
explore the utility of sources identified in this paper for
future ICRF work.
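
As a quick consistency check on the camera figures quoted above, the field of view follows from the detector format and pixel scale. The sketch assumes square pixels, no gaps or overlap between the four detectors, and a flat-sky approximation; the variable names are illustrative.

```python
PIXELS_PER_SIDE = 10_560     # pixels along each side of one detector
PIXEL_SCALE_ARCSEC = 0.9     # arcseconds per pixel
N_DETECTORS = 4

side_deg = PIXELS_PER_SIDE * PIXEL_SCALE_ARCSEC / 3600.0
area_per_detector = side_deg ** 2
total_area_deg2 = N_DETECTORS * area_per_detector

print(f"one detector spans {side_deg:.2f} deg on a side")
print(f"combined field of view: {total_area_deg2:.1f} deg^2")
```

Each detector spans about 2.64 degrees on a side, so the four together cover close to the 28 square degrees quoted.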